UNIguide: An Intelligent University Information Retrieval Chatbot Using Advanced Retrieval-Augmented Generation with Hybrid Search and Neural Re-Ranking
Authors: Pilli Karthik, Pechetti Banvi Swathmi, Manepalli Satya Sai Surya Ganesh, Paila Sai Datha, Barala Triveni, Sayyad Khalisha
Accessing timely and accurate institutional information remains a challenge for students and faculty in engineering colleges. This paper presents UNIguide, a domain-specific intelligent chatbot designed to answer university-related queries using a Retrieval-Augmented Generation (RAG) pipeline integrating hybrid retrieval, neural re-ranking, query validation, and semantic caching. Hybrid retrieval combines dense vector search with BM25 sparse keyword retrieval, improving recall for both semantic and lexical queries. A cross-encoder re-ranking stage further refines document ordering before answer generation by the Google Gemini large language model. Experimental results demonstrate improved retrieval accuracy and reduced latency through semantic caching.
Introduction
Engineering colleges generate vast amounts of information, spanning admissions, curricula, faculty profiles, placement records, and events, that is difficult for students to navigate on static websites. Conversational agents allow users to ask natural-language questions and receive precise answers. However, standard large language models (LLMs) lack domain-specific knowledge unless it is supplied at inference time.
Proposed Solution: UNIguide is a retrieval-augmented conversational system that combines information retrieval with generative LLMs to provide accurate, institution-specific responses. Key features include the following (minimal code sketches of the retrieval, re-ranking, answer-generation, and caching stages are given after the list):
Query Validation: Detects queries outside the system scope, such as personal records or real-time data requests.
Hybrid Retrieval: Combines dense vector search (semantic embeddings) and BM25 sparse retrieval to maximize recall.
Neural Re-Ranking: Cross-encoder models rank retrieved documents based on semantic relevance.
Answer Generation: The final response is generated using the Google Gemini LLM.
Semantic Caching: Frequently asked queries are stored to improve response speed and efficiency.
Dashboard and chatbot interfaces provide interactive, real-time access to college-specific information.
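The paper does not prescribe a particular rule for fusing the dense and sparse result lists. The following minimal Python sketch, given purely as an illustration, combines BM25 scores from the rank_bm25 library with cosine similarities from a sentence-transformers encoder using reciprocal rank fusion; the embedding model name and the toy corpus are assumptions rather than details taken from the system.

# Minimal hybrid-retrieval sketch (illustrative): BM25 sparse scores are fused
# with dense cosine similarities via reciprocal rank fusion (RRF).
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

documents = [
    "Admissions for the B.Tech programme open in June.",
    "The CSE curriculum includes a course on machine learning.",
    "Placement statistics for 2023 are published on the portal.",
]

# Sparse index over whitespace-tokenized documents.
bm25 = BM25Okapi([doc.lower().split() for doc in documents])

# Dense index built with a sentence-transformers encoder (model name assumed).
encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_embeddings = encoder.encode(documents, convert_to_tensor=True)

def hybrid_retrieve(query: str, k: int = 3, rrf_k: int = 60) -> list[str]:
    """Return the top-k documents ranked by reciprocal-rank-fusion score."""
    sparse_scores = bm25.get_scores(query.lower().split())
    dense_scores = util.cos_sim(
        encoder.encode(query, convert_to_tensor=True), doc_embeddings
    )[0]

    # Rank positions per retriever (position 0 is the best match).
    sparse_order = sorted(range(len(documents)), key=lambda i: -sparse_scores[i])
    dense_order = sorted(range(len(documents)), key=lambda i: -float(dense_scores[i]))

    fused: dict[int, float] = {}
    for order in (sparse_order, dense_order):
        for position, doc_id in enumerate(order):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (rrf_k + position + 1)

    best = sorted(fused, key=fused.get, reverse=True)[:k]
    return [documents[i] for i in best]

print(hybrid_retrieve("When do admissions start?"))

Weighted score interpolation is an equally common alternative to RRF; the choice mainly affects how sensitive the fusion is to the differing score scales of the two retrievers.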
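For the re-ranking stage, a cross-encoder jointly encodes each (query, document) pair and produces a relevance score. A minimal sketch with the sentence-transformers CrossEncoder class follows; the checkpoint name is an assumption, as the paper does not state which cross-encoder is used.

# Minimal cross-encoder re-ranking sketch (illustrative checkpoint name).
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query: str, candidates: list[str], top_n: int = 3) -> list[str]:
    """Score each (query, candidate) pair jointly and keep the top_n candidates."""
    scores = reranker.predict([(query, doc) for doc in candidates])
    order = sorted(range(len(candidates)), key=lambda i: -scores[i])
    return [candidates[i] for i in order[:top_n]]

# In the full pipeline these candidates would come from the hybrid retriever;
# a hard-coded list keeps the sketch self-contained.
candidates = [
    "The CSE curriculum includes a course on machine learning.",
    "Admissions for the B.Tech programme open in June.",
    "Placement statistics for 2023 are published on the portal.",
]
print(rerank("Which course covers machine learning?", candidates, top_n=2))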
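Answer generation is then grounded on the re-ranked passages. The sketch below uses the google-generativeai Python client; the model identifier, prompt wording, and API-key handling are assumptions made for illustration, not details reported in the paper.

# Minimal answer-generation sketch: re-ranked passages are folded into a
# grounding prompt and sent to a Gemini model (model name assumed).
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

def generate_answer(query: str, passages: list[str]) -> str:
    """Ask the LLM to answer using only the retrieved university passages."""
    context = "\n\n".join(passages)
    prompt = (
        "Answer the question using only the university information below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
    return model.generate_content(prompt).text

print(generate_answer(
    "When do B.Tech admissions open?",
    ["Admissions for the B.Tech programme open in June."],
))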
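Semantic caching compares the embedding of an incoming query with the embeddings of previously answered queries and returns the stored response when the similarity clears a threshold, avoiding a full retrieval-and-generation pass. The in-memory sketch below is a minimal illustration; the similarity threshold and encoder are assumptions.

# Minimal semantic-cache sketch: answers are reused for queries whose
# embeddings are sufficiently similar to a previously cached query.
import torch
from sentence_transformers import SentenceTransformer, util

class SemanticCache:
    def __init__(self, threshold: float = 0.85):
        self.encoder = SentenceTransformer("all-MiniLM-L6-v2")  # model name assumed
        self.threshold = threshold
        self.query_embeddings: list[torch.Tensor] = []
        self.answers: list[str] = []

    def put(self, query: str, answer: str) -> None:
        """Cache a generated answer keyed by the query's embedding."""
        self.query_embeddings.append(self.encoder.encode(query, convert_to_tensor=True))
        self.answers.append(answer)

    def get(self, query: str) -> str | None:
        """Return a cached answer if a sufficiently similar query was seen before."""
        if not self.query_embeddings:
            return None
        query_embedding = self.encoder.encode(query, convert_to_tensor=True)
        similarities = util.cos_sim(query_embedding, torch.stack(self.query_embeddings))[0]
        best = int(similarities.argmax())
        if float(similarities[best]) >= self.threshold:
            return self.answers[best]
        return None

cache = SemanticCache()
cache.put("When do B.Tech admissions open?", "Admissions open in June.")
# Returns the cached answer only if the paraphrase clears the similarity threshold.
print(cache.get("What is the admission start date for B.Tech?"))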
Conclusion
UNIguide demonstrates how hybrid retrieval and neural re-ranking can significantly enhance the effectiveness of university information chatbots. By combining dense vector search with BM25 sparse retrieval, the system improves the accuracy and relevance of retrieved documents for both semantic and keyword-based queries. The integration of a cross-encoder neural re-ranking stage further refines the ranking of retrieved results, ensuring that the most contextually relevant information is provided to users.
The implementation of semantic caching plays an important role in reducing response latency and improving system efficiency, particularly for frequently asked queries. By storing and reusing previously generated responses, the system minimizes redundant computations and provides faster interactions for users. Experimental evaluation shows that hybrid retrieval combined with neural re-ranking achieves higher precision and ranking quality than dense or sparse retrieval alone.
The proposed UNIguide system offers a scalable and efficient solution for handling institutional queries in academic environments. It simplifies access to university-related information such as admissions, curriculum details, faculty information, and campus resources through a conversational interface, thereby improving the overall user experience for students and staff.
Future work will focus on expanding the system's capabilities by incorporating multilingual support to accommodate diverse user populations. Additionally, integrating structured institutional databases and knowledge graphs could further improve answer accuracy and enable real-time information retrieval. Enhancing contextual conversation memory and incorporating feedback-driven learning mechanisms may also improve the chatbot's ability to handle complex multi-turn interactions. These advancements will help transform UNIguide into a more intelligent, adaptive, and comprehensive university information assistant.